" Unit 3 - Lecture 4 "
"------------------------------------------------------------------------"

" Hypothesis Testing "

"
Case 1:
One Population Mean Test
"
"
Hypothesis:

H0: Mu = Mu_0 v/s H1: Mu (< or > or !=) Mu_0

Define: alpha

Under H0,
TS: (x_bar - Mu_0) / S.E(x_bar) ~ t(n - 1),
    where S.E(x_bar) = S / sqrt(n)

DC: Compare Tab and Calculated value
Reject H0 if p-value < alpha


Assumptions:
1.) Xi ~ N
2.) SD is unknown

"

# Syntax (template only; fill in the arguments before running):
# t.test(x = , y = , alternative = , conf.level = ,
#        mu = , var.equal = , paired = )

"
Claims from a Motor Insurance Portfolio is
given below:
23085, 24143, 22156, 19079, 23874, 
27207, 23003, 26972, 24740, 22247

Test whether the mean claim amount is 10,000.

"

# H0: Mu = 10,000 v/s
# H1: Mu != 10,000

alpha = 0.05

X = c(23085, 24143, 22156, 19079, 23874, 
        27207, 23003, 26972, 24740, 22247)

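# Manual
# Test statistic: (x_bar - Mu_0) / S.E(x_bar)
T.S = (mean(X) - 10000) / sqrt(var(X) / length(X))

# Critical value approach (two-sided): TRUE means Reject H0
abs(T.S) > qt(alpha / 2,
              length(X) - 1,
              lower.tail = FALSE)

# p-value approach (two-sided): Reject H0 if this is < alpha
pt(abs(T.S),
   length(X) - 1,
   lower.tail = FALSE) * 2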


# Using Function (H1 is two-sided, so alternative = "two.sided")
t.test(X, alternative = "two.sided",
       conf.level = 0.95,
       mu = 10000)

# Comment:
"Since p-value < 5%, we Reject H0.
This means that the mean claim amount
in the motor insurance portfolio is NOT
equal to Rs.10,000"


"
Exam Question:
"


"------------------------------------------------------------------------"

"
Case 2:
Two Sample Mean Test
"
"
Hypothesis:

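H0: Mu_1 = Mu_2 v/s H1: Mu_1 (< or > or !=) Mu_2

Define: alpha

Under H0,
TS: (x_bar_1 - x_bar_2) / (S_p * sqrt(1/n_1 + 1/n_2)) ~ t(n_1 + n_2 - 2),
    where S_p^2 = ((n_1 - 1)*S_1^2 + (n_2 - 1)*S_2^2) / (n_1 + n_2 - 2)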

DC: Compare Tab and Calculated value
Reject H0 if p-value < alpha


Assumptions:
1.) Xij ~ N
2.) Sigma1, Sigma2 are unknown
3.) Sigma1 = Sigma2

"

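# A minimal sketch of the Case 2 calculation, assuming two independent
# normal samples with equal (unknown) variances. G1 and G2 are simulated,
# hypothetical data (not from the lecture), used only for illustration.
set.seed(1)
G1 = rnorm(12, 50, 5)
G2 = rnorm(15, 52, 5)

# H0: Mu_1 = Mu_2 v/s H1: Mu_1 != Mu_2
alpha = 0.05
n1 = length(G1)
n2 = length(G2)

# Pooled variance and test statistic
Sp2 = ((n1 - 1) * var(G1) + (n2 - 1) * var(G2)) / (n1 + n2 - 2)
T.CAL = (mean(G1) - mean(G2)) / sqrt(Sp2 * (1 / n1 + 1 / n2))

# Two-sided p-value; Reject H0 if p-value < alpha
P.VAL = 2 * pt(abs(T.CAL), n1 + n2 - 2, lower.tail = FALSE)
P.VAL < alpha

# The built-in function should report the same statistic and p-value
t.test(G1, G2, var.equal = TRUE)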



"------------------------------------------------------------------------"

"
Case 3:
One Population Variance Test
"
"
Hypothesis:

H0: Sigma^2 = Sigma_0^2 v/s
H1: Sigma^2 (< or > or !=) Sigma_0^2

Define: alpha

Under H0,
TS: (n - 1)*S^2 / Sigma_0^2 ~ Chi-Sq(n - 1)

DC: Compare Tab and Calculated value
Reject H0 if p-value < alpha

"

"
Question:
In a bottle cap manufacturing factory,
the diameters (in cm) of 100 bottle caps
are given below.

Check whether the S.D. is more than 2 cm.
"
set.seed(10)
Cap.Diameter = rnorm(100,
                     5,
                     sample(seq(1,5,0.25),1))

# H0: Sigma^2 = 2^2
# H1: Sigma^2 > 2^2

alpha = 0.05
n = length(Cap.Diameter)
Sigma_0 = 2


T.CAL = (n - 1) * var(Cap.Diameter) / 
          Sigma_0^2

# Right-tailed critical value
TAB = qchisq(alpha, n - 1,
             lower.tail = FALSE)

# TRUE means Reject H0
T.CAL > TAB

# Finding p-value (right-tailed)
pchisq(T.CAL, n - 1,
       lower.tail = FALSE)

"------------------------------------------------------------------------"

" 
Case 4:
Two Population Variance Test "
"
Hypothesis:

H0: Sigma_1^2 = Sigma_2^2
v/s
H1: Sigma_1^2 (< or > or !=) Sigma_2^2

Define: alpha

Under H0,
TS: S_1^2 / S_2^2 ~ F(n_1 - 1, n_2 - 1)

DC: Compare Tab and Calculated value
Reject H0 if p-value < alpha

"

# Syntax (template only; fill in the arguments before running):
# var.test(x = ,
#          y = ,
#          alternative = ,
#          conf.level = )


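# Example: Time Taken.csv (columns JUNIOR and DEGREE)
# Load the file first so that DATA exists
DATA = read.csv(file.choose())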

var.test(x = DATA$JUNIOR,
         y = DATA$DEGREE,
         alternative = "two.sided",
         conf.level = 0.95)
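
# A minimal sketch of the Case 4 calculation done by hand, so that it
# runs without Time Taken.csv. S1 and S2 are simulated, hypothetical
# samples (assumed normal); they are not the JUNIOR/DEGREE columns.
set.seed(2)
S1 = rnorm(20, 10, 2)
S2 = rnorm(25, 10, 3)

# H0: Sigma_1^2 = Sigma_2^2 v/s H1: Sigma_1^2 != Sigma_2^2
F.CAL = var(S1) / var(S2)
df1 = length(S1) - 1
df2 = length(S2) - 1

# Two-sided p-value: twice the smaller tail area
P.LOW = pf(F.CAL, df1, df2)
F.PVAL = 2 * min(P.LOW, 1 - P.LOW)
F.PVAL

# The built-in function should report the same statistic and p-value
var.test(S1, S2, alternative = "two.sided")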

"------------------------------------------------------------------------"

" Exercise "

" 
1.)
It is believed that the mean precision of an
instrument is 2.1. At 1% l.o.s, test whether
the precision is 2.1 or more.

Data = 2.5, 2.3, 2.4, 2.3, 2.5, 2.7,
       2.5, 2.6, 2.7, 2.7, 2.5
"

Data = c(2.5, 2.3, 2.4, 2.3, 2.5, 2.7,
         2.5, 2.6, 2.7, 2.7, 2.5)

# H0: Mu = 2.1 v/s H1: Mu > 2.1

t.test(Data,
       conf.level = 0.99,
       alternative = "greater",
       mu = 2.1)


"
2.)
Import the dataset Time Taken.
The dataset represents the time taken
by two groups of students to solve a problem.
Can we conclude that both groups took
equal time, on average, to solve the
problem?

"

DATA = read.csv(file.choose())

# H0: Mu1 - Mu2 = 0
# H1: Mu1 - Mu2 != 0

t.test(DATA$JUNIOR,
       DATA$DEGREE,
       var.equal = T)

# Comment


"
3.)
A truck firm wants to know whether they
can market their product as the best.
(Best indicates that the average tyre life is
more than 28,000 miles.)
In a sample of 40 tyres, the mean lifetime
was 27,563 miles and the s.d. was 1,348 miles.
Conclude at 0.01 l.o.s.

"

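# A possible sketch for Exercise 3. Only summary statistics are given,
# so t.test() cannot be applied directly and the statistic is computed
# by hand. Assumption (the question does not state it clearly): best
# means a mean tyre life of more than 28,000 miles, so H1 is right-tailed.
# All variable names here are illustrative.
# H0: Mu = 28000 v/s H1: Mu > 28000

n.tyre = 40
x.bar.tyre = 27563
s.tyre = 1348
Mu_0.tyre = 28000
alpha.tyre = 0.01

# TS: (x_bar - Mu_0) / (S / sqrt(n)) ~ t(n - 1) under H0
T.TYRE = (x.bar.tyre - Mu_0.tyre) / (s.tyre / sqrt(n.tyre))
T.TYRE

# Right-tailed p-value; Reject H0 if p-value < alpha
P.TYRE = pt(T.TYRE, n.tyre - 1, lower.tail = FALSE)
P.TYRE < alpha.tyre

# For reference: left-tailed p-value, if H1 were Mu < 28000 instead
pt(T.TYRE, n.tyre - 1)

# The sample mean is below 28,000, so H0 is not rejected in favour of
# Mu > 28000; the left-tailed p-value is also above 0.01.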

"
4.)
Ram and Karim just completed an investigation
which required the application of a
two-sample t-test to compare two independent
samples, each of size 11.

Their data was stored in the vectors Males and Females.
Run the code below to load them in your R session:

Males <- c(21,22,28,27,20,23,26,32,25,21,30)
Females <- c(19,18,38,33,24,39,22,29,28,26,30)

a.)
Conduct an appropriate test in R to
check whether the means of the two populations
are equal.
Your output should contain the
alternative hypothesis and the
p-value for the test.

Upon discussion,
they found that they had not conducted a test to
check the equal variance assumption
required for this test.

b.)
Comment on the validity of the test
conducted in a.)

"

"a.)"
# H0: Mu_1 = Mu_2
# H1: Mu_1 != Mu_2

Males <- c(21,22,28,27,20,23,26,32,25,21,30)
Females <- c(19,18,38,33,24,39,22,29,28,26,30)

t.test(Males,
       Females,
       var.equal = T)


"b.)"
# H0: Sigma_1^2 = Sigma_2^2
# H1: Sigma_1^2 != Sigma_2^2

var.test(Males,
         Females)


"------------------------------------------------------------------------"

"

Understand:
- Chi-Sq GoF Test
- Paired data

"
"------------------------------------------------------------------------"

"
Case 5:
Paired t-Test

"
"
Hypothesis:

H0: Diff = Del v/s H1: Diff (< or > or !=) Del

Define: alpha

Under H0,
TS: (D_Mean - Del) / S.E(D_Mean) ~ t(n - 1),
    where D_Mean is the mean of the n paired differences

DC: Compare Tab and Calculated value
Reject H0 if p-value < alpha


"

"
Example:
A clinic wants to test whether or not
there is a difference in the diabetic
scores before and after lunch.
Conclude the test at 5% l.o.s. "

set.seed(1101)
Before.Lunch = runif(10,150,250)
After.Lunch = Before.Lunch + runif(10,-20,20)

# H0: Diff = 0 (mean of Before - After)
# H1: Diff != 0

t.test(Before.Lunch,
       After.Lunch,
       paired = T)
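
# A minimal sketch of the same paired test done by hand, assuming the
# differences Before - After are approximately normal. The data are
# regenerated with the same seed so that this block runs on its own;
# Diff.Lunch, n.Diff, T.PAIRED and P.PAIRED are illustrative names.
set.seed(1101)
Before.Lunch = runif(10, 150, 250)
After.Lunch = Before.Lunch + runif(10, -20, 20)

Diff.Lunch = Before.Lunch - After.Lunch
n.Diff = length(Diff.Lunch)

# TS: (D_Mean - 0) / S.E(D_Mean) ~ t(n - 1) under H0
T.PAIRED = mean(Diff.Lunch) / (sd(Diff.Lunch) / sqrt(n.Diff))

# Two-sided p-value; Reject H0 at 5% l.o.s if p-value < 0.05
P.PAIRED = 2 * pt(abs(T.PAIRED), n.Diff - 1, lower.tail = FALSE)
P.PAIRED

# A paired t-test is a one-sample t-test on the differences
t.test(Diff.Lunch, mu = 0)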


"------------------------------------------------------------------------"
